Guaranteed Bounds for General Approximate Dynamic Programming

Abstract

In this paper, we will develop a systematic approach to deriving guaranteed bounds for approximate dynamic programming (ADP) schemes in optimal control problems. Our approach is inspired by our recent results on bounding the performance of greedy strategies in optimization of string-submodular functions over a finite horizon. The approach is to derive a string-submodular optimization problem, for which the optimal strategy is the optimal control solution and the greedy strategy is the ADP solution. Using this approach, we show that any ADP solution achieves a performance that is at least a factor of $\beta$ of the performance of the optimal control solution, which satisfies Bellman's optimality principle. The factor $\beta$ depends on the specific ADP scheme, as we will explicitly characterize. To illustrate the applicability of our bounding technique, we present examples of ADP schemes, including the popular rollout method.
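The greedy-versus-optimal comparison that the abstract describes can be illustrated with a minimal sketch. This is not the paper's construction: the coverage objective, the action table `ACTIONS`, the horizon, and the helper names `value`, `greedy`, and `optimal` are all hypothetical, chosen only to show how the ratio between a greedy string and an optimal string of the same length might be measured on a toy problem.

```python
from itertools import product

# Hypothetical toy problem (an assumption, not taken from the paper):
# each action covers a set of targets, and the objective of a string of
# actions is the number of distinct targets covered.
ACTIONS = {
    "a": {1, 2, 3, 4},
    "b": {1, 2, 5},
    "c": {3, 4, 6},
}
HORIZON = 2


def value(string):
    """Objective of a string of actions: number of distinct targets covered."""
    covered = set()
    for action in string:
        covered |= ACTIONS[action]
    return len(covered)


def greedy(horizon):
    """Build a string by appending, at each step, the action with the best one-step value."""
    string = ()
    for _ in range(horizon):
        best = max(sorted(ACTIONS), key=lambda a: value(string + (a,)))
        string += (best,)
    return string


def optimal(horizon):
    """Exhaustively search every string of the given length."""
    return max(product(sorted(ACTIONS), repeat=horizon), key=value)


greedy_string = greedy(HORIZON)
optimal_string = optimal(HORIZON)
ratio = value(greedy_string) / value(optimal_string)
print(greedy_string, optimal_string, ratio)
```

In this toy instance the greedy string starts with action `a` and ends up covering 5 targets, while the exhaustive search finds a string covering all 6, so the measured ratio is 5/6. That ratio is specific to this example and says nothing about the factor $\beta$ characterized in the paper.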

Bibliographic details

Authors: Liu, Yajing; Chong, Edwin K. P.; Pezeshki, Ali; Moran, Bill
Year: 2014
Original format: PDF
Language: English

Similar literature

1. A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees [J]. de Farias DP, Van Roy B. Mathematics of Operations Research. 2006, no. 3.
2. Performance Guarantees for Model-Based Approximate Dynamic Programming in Continuous Spaces [J]. Beuchat Paul Nathaniel, Georghiou Angelos, Lygeros John. IEEE Transactions on Automatic Control. 2020, no. 1.
3. Performance Guarantee of an Approximate Dynamic Programming Policy for Robotic Surveillance [J]. M. Park, K. Kalyanam, S. Darbha. IEEE Transactions on Automation Science and Engineering. 2016, no. 2.
4. Bounded Real-Time Dynamic Programming: RTDP with monotone upper bounds and performance guarantees [C]. H. Brendan McMahan, Maxim Likhachev, Geoffrey J. Gordon. International Conference on Machine Learning. 2005.
5. Stochastic Dual Dynamic Programming and Backward Approximate Dynamic Programming with Integrated Crossing State Stochastic Models for Wind Power in Energy Storage Optimization [D]. Durante, Joseph L. 2020.
6. Solving the dynamic ambulance relocation and dispatching problem using approximate dynamic programming [O]. Verena Schmid.
7. A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees [O]. Daniela Pucci de Farias, Benjamin Van Roy. 2006.
